galtonfamiliesmain.csv

galtonfamiliessub.csv

galtonparentheights.csv

galtonfamiliesnotebook.csv

See also Pearson Height Dataset and Anthropometric Dataset

Description

Francis Galton, a cousin of Charles Darwin, studied the relationship between parent heights and the heights of their offspring. From his original article on regression, cited below: “My data consisted of the heights of 930 adult children and of their respective parentages, 205 in number. In every case I transmuted the female statures to their corresponding male equivalents and used them in their transmuted form… The factor I used was 1.08, which is equivalent to adding a little less than one-twelfth to each female height. It differs a very little from the factors employed by other anthropologists…”
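Galton's remark that the 1.08 factor adds "a little less than one-twelfth" can be checked with exact arithmetic. The sketch below is illustrative only: the 60-inch height is a hypothetical value chosen for round numbers and is not taken from the data.

```python
from fractions import Fraction

factor = Fraction(108, 100)               # Galton's stated factor, 1.08
one_twelfth_more = 1 + Fraction(1, 12)    # adding exactly one-twelfth

# Hypothetical female height in inches (not from the dataset).
height = Fraction(60)

by_factor = height * factor               # height scaled by 1.08
by_twelfth = height * one_twelfth_more    # height plus exactly one-twelfth

# The factor adds 8%, while one-twelfth is about 8.33%.
print(float(by_factor), float(by_twelfth), float(by_twelfth - by_factor))
```

For a 60-inch height the two rules differ by a fifth of an inch, which is the sense in which 1.08 falls slightly short of adding one-twelfth.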

The galtonfamiliesmain dataset was created under the direction of Dr. James A. Hanley from Galton’s original paper notebooks. Female heights are recorded in their raw (untransmuted) form, so Galton’s 1.08 factor has not been applied. Eight families were left out of the main dataset for illustrative purposes; information about them is found in the galtonfamiliessub dataset. The galtonparentheights dataset contains only the parents’ heights, one row per family.

Variables—Main Dataset

Rows: 898
Columns: 6
$ FamilyID <chr> "1", "1", "1", "1", "2", "2", "2", "2", "3", "3", "4", "4", "…
$ Children <dbl> 4, 4, 4, 4, 4, 4, 4, 4, 2, 2, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6…
$ Father   <dbl> 78.5, 78.5, 78.5, 78.5, 75.5, 75.5, 75.5, 75.5, 75.0, 75.0, 7…
$ Mother   <dbl> 67.0, 67.0, 67.0, 67.0, 66.5, 66.5, 66.5, 66.5, 64.0, 64.0, 6…
$ Child    <chr> "Son", "Daughter", "Daughter", "Daughter", "Son", "Son", "Dau…
$ Height   <dbl> 73.2, 69.2, 69.0, 69.0, 73.5, 72.5, 65.5, 65.5, 71.0, 68.0, 7…
# A tibble: 6 × 6
  FamilyID Children Father Mother Child    Height
  <chr>       <dbl>  <dbl>  <dbl> <chr>     <dbl>
1 1               4   78.5   67   Son        73.2
2 1               4   78.5   67   Daughter   69.2
3 1               4   78.5   67   Daughter   69  
4 1               4   78.5   67   Daughter   69  
5 2               4   75.5   66.5 Son        73.5
6 2               4   75.5   66.5 Son        72.5
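Because the main dataset stores female heights in raw form, anyone reproducing Galton's analysis has to apply the factor themselves. The following is a minimal sketch of that step using only the six rows shown above. The inline CSV layout and the `HeightMaleEq` column name are assumptions made for illustration, not part of the dataset.

```python
import csv
import io

# Inline copy of the six rows shown above; the CSV layout is an assumption.
SAMPLE = """FamilyID,Children,Father,Mother,Child,Height
1,4,78.5,67,Son,73.2
1,4,78.5,67,Daughter,69.2
1,4,78.5,67,Daughter,69
1,4,78.5,67,Daughter,69
2,4,75.5,66.5,Son,73.5
2,4,75.5,66.5,Son,72.5
"""

FACTOR = 1.08  # Galton's female-to-male conversion factor

rows = []
for rec in csv.DictReader(io.StringIO(SAMPLE)):
    height = float(rec["Height"])
    rows.append({
        "FamilyID": rec["FamilyID"],  # kept as text, matching the <chr> type
        "Child": rec["Child"],
        "Height": height,
        # Daughters are scaled by the factor; sons keep their recorded height.
        "HeightMaleEq": height * FACTOR if rec["Child"] == "Daughter" else height,
    })

for row in rows:
    print(row["FamilyID"], row["Child"], row["Height"], round(row["HeightMaleEq"], 2))
```

The same conversion would apply to the Mother column when a combined parental height is needed.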

Variables—Eight Families

Rows: 36
Columns: 6
$ FamilyID <dbl> 13, 13, 50, 50, 84, 84, 84, 84, 84, 111, 120, 120, 120, 120, …
$ Children <dbl> 2, 2, 2, 2, 4, 4, 4, 4, 4, 1, 11, 11, 11, 11, 11, 11, 11, 11,…
$ FatherR  <dbl> 13.0, 13.0, 11.0, 11.0, 10.5, 10.5, 10.5, 10.5, 10.5, 9.0, 9.…
$ MotherR  <dbl> 7.0, 7.0, 5.4, 5.4, 3.0, 3.0, 3.0, 3.0, 3.0, 3.5, 2.0, 2.0, 2…
$ Child    <chr> "Son", "Daughter", "Son", "Daughter", "Son", "Son", "Son", "D…
$ HeightR  <dbl> 11.0, 2.0, 13.0, 2.0, 10.0, 8.5, 5.5, 5.5, 3.5, 5.5, 12.0, 10…
# A tibble: 6 × 6
  FamilyID Children FatherR MotherR Child    HeightR
     <dbl>    <dbl>   <dbl>   <dbl> <chr>      <dbl>
1       13        2    13       7   Son         11  
2       13        2    13       7   Daughter     2  
3       50        2    11       5.4 Son         13  
4       50        2    11       5.4 Daughter     2  
5       84        4    10.5     3   Son         10  
6       84        4    10.5     3   Son          8.5

Variables—Main Parents Only

Rows: 205
Columns: 3
$ FamilyID <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18…
$ Father   <dbl> 78.5, 75.5, 75.0, 75.0, 75.0, 74.0, 74.0, 74.0, 74.5, 74.0, 7…
$ Mother   <dbl> 67.0, 66.5, 64.0, 64.0, 58.5, 68.0, 68.0, 66.5, 66.0, 65.5, 6…
# A tibble: 6 × 3
  FamilyID Father Mother
     <dbl>  <dbl>  <dbl>
1        1   78.5   67  
2        2   75.5   66.5
3        3   75     64  
4        4   75     64  
5        5   75     58.5
6        6   74     68  
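The parents-only table appears to be the main dataset collapsed to one row per family. The sketch below checks this for the rows shown on this page. It is not a description of how the file was actually built. Note that FamilyID is text in the main dataset but numeric here, so the comparison converts both to strings first.

```python
# First six rows of the main dataset, as (FamilyID, Father, Mother).
main_rows = [
    ("1", 78.5, 67.0), ("1", 78.5, 67.0), ("1", 78.5, 67.0), ("1", 78.5, 67.0),
    ("2", 75.5, 66.5), ("2", 75.5, 66.5),
]

# First two rows of the parents-only dataset; FamilyID is numeric here.
parent_rows = [(1, 78.5, 67.0), (2, 75.5, 66.5)]

# Collapse the main rows to one entry per family, keeping the first occurrence.
parents_from_main = {}
for family_id, father, mother in main_rows:
    parents_from_main.setdefault(family_id, (father, mother))

# Index the parents-only rows by string key so the two ID types line up.
parents_lookup = {str(fid): (father, mother) for fid, father, mother in parent_rows}

# For each family in the main sample, do the parent heights agree?
matches = {
    fid: parents_from_main[fid] == parents_lookup.get(fid)
    for fid in parents_from_main
}
print(matches)
```

Converting the numeric IDs to text, rather than the other way round, avoids failures if any main-dataset ID turns out not to be a plain number.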

References

Random Services: Galton’s Height Data

Galton, Francis. (1886). Regression towards mediocrity in hereditary stature. The Journal of the Anthropological Institute of Great Britain and Ireland, 15, pp. 246-263.